AMENDMENTS TO THE CLAIMS

This listing of claims will replace all prior versions and listings of claims in the 
application. 

LISTING OF CLAIMS: 

1. (Currently Amended) A method for converting a generic document, wherein a
generic document comprises a document in a particular format type, into a structured
document, wherein a structured document includes a plurality of content elements wrapped in
pairs of hierarchically nested tags, comprising:

parsing the document of the particular format type containing content into a plurality
of content elements; and

for a selected content element, suggesting an optimal tag according to a tag suggestion 
procedure; 

wherein the tag suggestion procedure comprises: 
providing sample data in the form of structured sample documents; 
analyzing patterns in the sample data to derive a set of tag suggestions; 
deriving a set of candidate tags from the set of tag suggestions for the selected content 
element; and 

evaluating the set of candidate tags according to tag suggestion criteria to determine an 
optimal tag for the selected content element. 

2. (Original) The method of claim 1, wherein the tag suggestion criteria 
comprises satisfying a similarity function. 

3. (Original) The method of claim 1, wherein the set of tag suggestions are 
generated during creation of the structured document. 

4. (Original) The method of claim 1, wherein the set of tag suggestions are 
generated prior to creation of the structured document. 

5. (Original) The method of claim 1, wherein the structured sample document
comprises an XML document having a DTD associated with it. 

6. (Original) The method of claim 1, wherein the set of tag suggestions includes 
tree patterns of tags. 

7. (Original) The method of claim 1, wherein the optimal tag maximizes a 
similarity function with patterns found in the sample data. 

8. (Original) The method of claim 6, wherein the tag suggestion criteria 
comprises balancing size of tree patterns of tags and frequency of occurrence of tree patterns of 
tags in the sample data. 

9. (Original) The method of claim 1, wherein the set of tag suggestions includes
a set of tree patterns of tags t_i ∈ T, and a set C of candidates is a set of all patterns in T with all
their prefixes, C = {c | c is a prefix of t_i ∈ T};

wherein a similarity function between a candidate c ∈ C and a tree pattern t_i ∈ T
satisfies: sim(c, t_i) = |c| / |t_i|, if c is a tree-prefix of t_i;
sim(c, t_i) = 0, otherwise; and

wherein the optimal tag comprises a context-free candidate c ∈ C that maximizes an
aggregate similarity measure SIM(c, T), where SIM(c, T) = Σ_{t_i ∈ T} sim(c, t_i) · pr_{t_i}.
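
The selection recited in claim 9 can be illustrated with a minimal Python sketch. It rests on several assumptions that are not part of the claim: each tree pattern is simplified to a linear tuple of tag names (so "tree-prefix" becomes an ordinary leading-subsequence test), pr_{t_i} is taken to be the relative frequency of pattern t_i in the sample data, and the names `prefixes`, `sim`, and `best_candidate` are hypothetical.

```python
from collections import Counter
from fractions import Fraction


def prefixes(pattern):
    """All non-empty prefixes of a pattern, including the pattern itself."""
    return [pattern[:k] for k in range(1, len(pattern) + 1)]


def sim(c, t):
    """sim(c, t) = |c| / |t| if c is a prefix of t, else 0."""
    if t[:len(c)] == c:
        return Fraction(len(c), len(t))
    return Fraction(0)


def best_candidate(observed_patterns):
    """Return (candidate, SIM) maximizing SIM(c, T) = sum_i sim(c, t_i) * pr_i.

    Assumption: pr_i is the relative frequency of pattern t_i among the
    observed patterns; patterns are tuples of tag names.
    """
    counts = Counter(observed_patterns)
    total = sum(counts.values())
    pr = {t: Fraction(n, total) for t, n in counts.items()}

    # Candidate set C: every pattern in T together with all of its prefixes.
    candidates = sorted({c for t in pr for c in prefixes(t)})

    def aggregate(c):
        return sum(sim(c, t) * pr[t] for t in pr)

    best = max(candidates, key=aggregate)
    return best, aggregate(best)


# Hypothetical sample data: three occurrences of one pattern, one of another.
observed = [("sec", "title")] * 3 + [("sec", "para", "b")]
best, score = best_candidate(observed)
```

With this sample data the frequent short pattern `("sec", "title")` wins with an aggregate score of 3/4, while the bare prefix `("sec",)` scores 11/24, which shows how the measure trades pattern size against frequency.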

10. (Original) The method of claim 9, wherein a candidate set in context t_ctx is
defined as C(t_ctx) = {c ∈ C | t_ctx is a prefix of c}; and

wherein the optimal tag comprises a context-aware candidate c ∈ C that maximizes an
aggregate similarity measure SIM(c, T), where SIM(c, T) = Σ_{t_i ∈ T} sim(c, t_i) · pr_{t_i}.

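
The context-aware variant of claim 10 can be sketched in the same way: the candidate set is first restricted to candidates that extend the current context, and the same aggregate measure is then maximized. As an assumption for illustration only, patterns are again modeled as linear tuples of tag names, pr_{t_i} is a relative frequency, and the names `is_prefix`, `sim`, and `best_in_context` are hypothetical.

```python
from collections import Counter
from fractions import Fraction


def is_prefix(p, t):
    """True if tuple p is a leading subsequence of tuple t."""
    return t[:len(p)] == p


def sim(c, t):
    """sim(c, t) = |c| / |t| if c is a prefix of t, else 0."""
    return Fraction(len(c), len(t)) if is_prefix(c, t) else Fraction(0)


def best_in_context(observed_patterns, t_ctx):
    """Pick the candidate in C(t_ctx) that maximizes SIM(c, T).

    Assumption: pr is the relative frequency of each observed pattern.
    Returns None when no candidate extends the given context.
    """
    counts = Counter(observed_patterns)
    total = sum(counts.values())
    pr = {t: Fraction(n, total) for t, n in counts.items()}

    # Full candidate set C: all patterns with all of their prefixes.
    all_candidates = {t[:k] for t in pr for k in range(1, len(t) + 1)}
    # C(t_ctx): candidates of which the context is a prefix.
    in_context = sorted(c for c in all_candidates if is_prefix(t_ctx, c))
    if not in_context:
        return None
    return max(in_context, key=lambda c: sum(sim(c, t) * pr[t] for t in pr))


observed = [("sec", "title")] * 3 + [("sec", "para", "b")]
inside_para = best_in_context(observed, ("sec", "para"))
inside_sec = best_in_context(observed, ("sec",))
```

In this sketch the context `("sec", "para")` excludes the globally best pattern, so the suggestion becomes `("sec", "para", "b")`, whereas the broader context `("sec",)` still yields `("sec", "title")`.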
11. (Currently Amended) A method for authoring of a structured document,
wherein a structured document comprises a plurality of content elements wrapped in pairs of 
tags, comprising: 

generating content elements wrapped in pairs of tags; and 

for a selected tag, suggesting an optimal content fragment according to a content
suggestion procedure;

wherein the content suggestion procedure comprises: 
providing a sample structured document; 

deriving a set of content fragments from the sample structured document; 

evaluating the set of content fragments according to a content fragment suggestion criteria 
to determine an optimal content fragment suggestion for the tag, wherein the optimal content 
fragment suggestion is the most probable content fragment for the selected tag. 

12. (Original) The method of claim 11, further comprising assigning a score to 
each content fragment in the set of content fragments, wherein the score is a ratio of number of 
occurrences of the content fragment under the selected tag and number of occurrences of the 
selected tag in the sample structured document. 
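
The score of claim 12 can be illustrated with a short Python sketch using the standard-library `xml.etree.ElementTree` parser. The sample XML, the decision to treat an element's stripped text as its content fragment, and the name `fragment_scores` are assumptions made for this example only.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical sample structured document.
SAMPLE = """<doc>
  <status>open</status>
  <status>open</status>
  <status>closed</status>
  <owner>alice</owner>
</doc>"""


def fragment_scores(xml_text, tag):
    """Score each content fragment found under `tag`.

    score = (occurrences of the fragment under the tag)
            / (occurrences of the tag in the sample document)
    Assumption: a fragment is the stripped text of an element.
    """
    root = ET.fromstring(xml_text)
    fragments = [(el.text or "").strip() for el in root.iter(tag)]
    tag_count = len(fragments)
    if tag_count == 0:
        return {}
    counts = Counter(f for f in fragments if f)
    return {f: n / tag_count for f, n in counts.items()}


scores = fragment_scores(SAMPLE, "status")
# Highest-scoring fragment for the selected tag.
most_probable = max(sorted(scores), key=scores.get)
```

Here `open` appears under two of the three `status` elements, so it receives the highest score and is the fragment a system of this kind would suggest.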

13. (Original) The method of claim 12, wherein the optimal content fragment 
suggestion is the content fragment with the highest score. 

14. (Original) The method of claim 12, further comprising assigning a context to 
each content fragment in the set of content fragments, wherein context comprises the structural 
context of the tag surrounding the content fragment. 

15. (Original) The method of claim 12, wherein the optimal content fragment 
suggestion is the content fragment with the highest score greater than a threshold value. 
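
The thresholded selection of claim 15 can be sketched as a small filter over precomputed scores. The function name `suggest_above_threshold`, the example scores, and the threshold values are hypothetical and serve only to illustrate the comparison.

```python
def suggest_above_threshold(scores, threshold):
    """Return the highest-scoring fragment if its score exceeds `threshold`.

    `scores` maps each content fragment to its score for the selected tag.
    Returns None when there are no fragments or the best score is not
    strictly greater than the threshold.
    """
    if not scores:
        return None
    best = max(sorted(scores), key=scores.get)
    return best if scores[best] > threshold else None


# Hypothetical scores for one selected tag.
example_scores = {"open": 0.6, "closed": 0.4}
confident = suggest_above_threshold(example_scores, 0.5)
withheld = suggest_above_threshold(example_scores, 0.75)
```

With a threshold of 0.5 the fragment `open` is suggested; raising the threshold to 0.75 suppresses the suggestion entirely.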

16. (Original) The method of claim 14, wherein each content fragment is 
referenced by a partial path from the sample structured document root and the context comprises
the partial path of the content fragment in the sample structured document. 

17. (Original) The method of claim 11, further comprising:

selecting a small linguistic unit within each content fragment in the set of content 
fragments; and 

assigning a score to the small linguistic unit, wherein the score is a ratio of number of 
occurrences of the linguistic unit under the selected tag and number of occurrences of the 
selected tag in the sample structured document. 
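
The unit-level score of claim 17 can be sketched as follows, taking words as the small linguistic unit. The sample XML, the regular-expression tokenizer, the lower-casing, and the name `word_scores` are assumptions for illustration. The sketch counts every occurrence of a word, so a word repeated within one fragment can score above 1 under this literal reading.

```python
import re
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical sample structured document.
SAMPLE = """<doc>
  <note>see attached file</note>
  <note>see previous note</note>
  <note>attached</note>
</doc>"""


def word_scores(xml_text, tag):
    """Score each word found in fragments under `tag`.

    score = (occurrences of the word under the tag)
            / (occurrences of the tag in the sample document)
    Assumption: words are lower-cased runs of word characters.
    """
    root = ET.fromstring(xml_text)
    elements = list(root.iter(tag))
    if not elements:
        return {}
    counts = Counter()
    for el in elements:
        counts.update(re.findall(r"\w+", (el.text or "").lower()))
    return {word: n / len(elements) for word, n in counts.items()}


scores = word_scores(SAMPLE, "note")
```

In this sample, `see` and `attached` each occur twice across three `note` elements and score 2/3, while `file` scores 1/3.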

18. (Original) The method of claim 17, wherein the small linguistic unit is a 
word, a phrase or a sentence. 

19. (Original) The method of claim 14, wherein the context of each content
fragment in the set of content fragments comprises the structural tree around the tag surrounding 
the content fragment. 

20. (Original) The method of claim 1, wherein content comprises text.